Robust Techniques for Organizing and Retrieving Spoken Documents
Similar resources
Robust Techniques for Organizing and Retrieving Spoken Documents
Information retrieval tasks such as document retrieval and topic detection and tracking (TDT) show little degradation when applied to speech recognizer output. We claim that the robustness of the process is because of inherent redundancy in the problem: not only are words repeated, but semantically related words also provide support. We show how document and query expansion can enhance that red...
Speech Recognition and Information Retrieval: Experiments in Retrieving Spoken Documents
The Informedia Digital Video Library Project at Carnegie Mellon University is making large corpora of video and audio data available for full content retrieval by integrating natural language understanding, image processing, speech recognition and information retrieval. Information retrieval from corpora of speech recognition output is critical to the project’s success. In this paper, we out...
Thematic indexing of spoken documents by using self-organizing maps
A method is presented to provide a useful searchable index for spoken audio documents. The task differs from the traditional (text) document indexing, because large audio databases are decoded by automatic speech recognition and decoding errors occur frequently. The idea in this paper is to take advantage of the large size of the database and select the best index terms for each document with th...
Robust retrieval models for false positive errors in spoken documents
How to deal with speech recognition errors and out-of-vocabulary (OOV) words, which are referred to as false negative errors, are common challenges in spoken document processing. To deal with them in spoken content retrieval (SCR), the SCR method that incorporated spoken term detection (STD) as the pre-process stage (referred to as STD-SCR) has been proposed. However, the STD-SCR tends to increa...
The MERL SpokenQuery information retrieval system: a system for retrieving pertinent documents from a spoken query
This paper describes some key concepts developed and used in the design of a spoken-query based information retrieval system developed at the Mitsubishi Electric Research Labs (MERL). Innovations in the system include automatic inclusion of signature terms of documents in the recognizer’s vocabulary, the use of uncertainty vectors to represent spoken queries, and a method of indexing that accom...
Journal
Journal title: EURASIP Journal on Advances in Signal Processing
Year: 2003
ISSN: 1687-6180
DOI: 10.1155/s1110865703211070